library(brms)
library(cmdstanr)
library(tidyverse)
library(bayesplot)
library(patchwork)
# cmdstanr::set_cmdstan_path("/opt/cmdstan/cmdstan-2.37.0")
my_theme <- theme_bw() + theme(
axis.text = element_text(size = 14),
strip.text = element_text(size = 14)
)
theme_set(my_theme)
Practical Bayesian Modeling with Stan and brms
1 Docker (optional)
Run this in your terminal to start an RStudio Server with all the necessary packages installed.
docker run -d -p 8787:8787 \
-e PASSWORD=123456 \
mattocci/afec-bayes:prod
Access 127.0.0.1:8787 and log in with the username rstudio and the password 123456 (or your own password).
Note: Because you’re running this container on your own machine and mapping only to 127.0.0.1:8787, the RStudio Server isn’t accessible from outside your computer. So, using a simple password like 123456 is fine and safe for local development.
2 Setup
library(tictoc)
3 Normal model
Leaf mass per area (LMA) is an important trait that reflects plant strategies. LMA for interspecific data often shows a log-normal distribution.
\[ \text{log}(y_i) \sim N(\mu, \sigma) \]
Q: What are the mean (\(\mu\)) and standard deviation (\(\sigma\)) of log LMA?
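As a quick sanity check before fitting, the sample mean and SD of the simulated log LMA values (generated below with mu = 4.6 and sigma = 0.7) should be close to the posterior means of \(\mu\) and \(\sigma\); a minimal sketch in base R:

```r
# Method-of-moments check: sample mean and SD of simulated log LMA
# (same generating values as the dummy data below: mu = 4.6, sigma = 0.7)
set.seed(123)
y <- rnorm(100, mean = 4.6, sd = 0.7)
c(mean = mean(y), sd = sd(y))
```

These empirical values are what the posterior means of \(\mu\) and \(\sigma\) should recover under the weakly informative priors used later.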
3.1 Dummy data
set.seed(123)
n <- 100
mu <- 4.6
sigma <- 0.7
y <- rnorm(n, mean = mu, sd = sigma)
hist(y)
CmdStan uses a list for data input.
normal_list <- list(
y = y,
N = length(y)
)
brms uses a data frame (tibble) for data input, like lme4::lmer and stats::glm.
normal_df <- tibble(
y = y,
N = length(y)
)
3.2 CmdStan
We need to write and save Stan programs (.stan files).
stan/normal.stan
// observed data
data {
int<lower=0> N;
vector[N] y;
}
// parameters to be estimated
parameters {
real mu;
real<lower=0> sigma;
}
model {
// likelihood
y ~ normal(mu, sigma);
// priors
mu ~ normal(0, 10);
sigma ~ cauchy(0, 2.5);
}
stan/normal_vague.stan
data {
int<lower=0> N;
vector[N] y;
}
parameters {
real mu;
real<lower=0> sigma;
}
model {
y ~ normal(mu, sigma);
mu ~ normal(0, 1e+4); // Vague prior for mu
sigma ~ normal(0, 1e+4); // Vague prior for sigma
}
These commands compile the models, translating the Stan code to C++ and then to machine-executable binaries. Compilation takes a while (roughly 20 seconds to 1 minute), but only the first time; later calls reuse the cached binary.
normal_mod <- cmdstan_model("stan/normal.stan")
normal_vague_mod <- cmdstan_model("stan/normal_vague.stan")
# Persist CSV outputs to avoid /tmp cleanup during render
normal_out_dir <- file.path("assets", "cmdstan", "normal")
if (!dir.exists(normal_out_dir)) dir.create(normal_out_dir, recursive = TRUE)
normal_fit <- normal_mod$sample(
data = normal_list,
seed = 123,
chains = 4,
parallel_chains = 4,
iter_warmup = 1000, # number of warmup iterations
iter_sampling = 1000, # number of sampling iterations
refresh = 500, # print update every 500 iters
output_dir = normal_out_dir
)
#> Running MCMC with 4 parallel chains...
#>
#> Chain 1 Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 1 Iteration: 500 / 2000 [ 25%] (Warmup)
#> Chain 1 Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 1 Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 1 Iteration: 1500 / 2000 [ 75%] (Sampling)
#> Chain 1 Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 2 Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 2 Iteration: 500 / 2000 [ 25%] (Warmup)
#> Chain 2 Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 2 Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 2 Iteration: 1500 / 2000 [ 75%] (Sampling)
#> Chain 2 Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 3 Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 3 Iteration: 500 / 2000 [ 25%] (Warmup)
#> Chain 3 Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 3 Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 3 Iteration: 1500 / 2000 [ 75%] (Sampling)
#> Chain 3 Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 4 Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 4 Iteration: 500 / 2000 [ 25%] (Warmup)
#> Chain 4 Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 4 Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 4 Iteration: 1500 / 2000 [ 75%] (Sampling)
#> Chain 4 Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 1 finished in 0.0 seconds.
#> Chain 2 finished in 0.0 seconds.
#> Chain 3 finished in 0.0 seconds.
#> Chain 4 finished in 0.0 seconds.
#>
#> All 4 chains finished successfully.
#> Mean chain execution time: 0.0 seconds.
#> Total execution time: 0.2 seconds.
We don’t have to recompile the model to change the number of iterations, the number of chains, or the data.
normal_vague_fit <- normal_vague_mod$sample(
data = normal_list,
seed = 123,
chains = 4,
parallel_chains = 4,
refresh = 500 # print update every 500 iters
)
3.2.1 Summary
We use this output when writing manuscripts and for model diagnostics.
normal_fit$summary()
#> # A tibble: 3 × 10
#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 lp__ -6.34 -6.03 1.00 0.721 -8.30 -5.39 1.00 1927. 2334.
#> 2 mu 4.66 4.66 0.0659 0.0651 4.55 4.77 1.000 3370. 2293.
#> 3 sigma 0.646 0.643 0.0456 0.0455 0.575 0.725 1.00 3819. 2828.
lp__: log posterior.
- \(\text{log}\; p(\theta \mid \text{data}) \propto \text{log}\; p(\text{data} \mid \theta) + \text{log}\; p(\theta)\)
rhat (Gelman–Rubin statistic): should be close to 1.
- Measures the convergence of the chains.
- rhat > 1.1: definitely bad.
- 1.01 < rhat <= 1.1: suspicious.
- rhat <= 1.01: good.
ess_bulk and ess_tail: effective sample sizes for the bulk and tail of the posterior distribution.
- ess_bulk: sampling efficiency for bulk summaries (e.g., mean, SD).
- ess_tail: sampling efficiency for tail summaries (e.g., quantiles like 2.5%, 97.5%).
- ess_bulk, ess_tail > 400: good.
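These thresholds can also be checked programmatically. A minimal sketch with toy diagnostic values (the column names match what `$summary()` returns; the numbers here are illustrative, not from a real fit):

```r
library(dplyr)
# Toy summary table with the diagnostic columns returned by $summary()
fit_summary <- tibble::tibble(
  variable = c("lp__", "mu", "sigma"),
  rhat     = c(1.00, 1.00, 1.02),
  ess_bulk = c(1900, 3400, 350),
  ess_tail = c(2300, 2300, 2800)
)
# Flag parameters outside the guidelines above
bad <- filter(fit_summary, rhat > 1.01 | ess_bulk < 400 | ess_tail < 400)
bad$variable
#> [1] "sigma"
```

The same `filter()` call works directly on `normal_fit$summary()`; an empty result means every parameter passes.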
3.2.2 Draws (posterior samples)
For each parameter, we have 1000 iterations \(\times\) 4 chains = 4000 posterior draws. We use these for visualization and diagnostics.
normal_fit$draws()
#> # A draws_array: 1000 iterations, 4 chains, and 3 variables
#> , , variable = lp__
#>
#> chain
#> iteration 1 2 3 4
#> 1 -5.4 -5.5 -10.3 -5.5
#> 2 -5.5 -6.8 -5.4 -5.7
#> 3 -5.5 -6.7 -6.7 -6.1
#> 4 -6.3 -6.1 -5.8 -6.8
#> 5 -6.1 -5.5 -5.7 -6.0
#>
#> , , variable = mu
#>
#> chain
#> iteration 1 2 3 4
#> 1 4.6 4.6 4.5 4.7
#> 2 4.6 4.8 4.7 4.7
#> 3 4.6 4.8 4.6 4.6
#> 4 4.7 4.7 4.7 4.7
#> 5 4.7 4.6 4.7 4.6
#>
#> , , variable = sigma
#>
#> chain
#> iteration 1 2 3 4
#> 1 0.65 0.66 0.63 0.63
#> 2 0.64 0.67 0.62 0.66
#> 3 0.63 0.64 0.70 0.60
#> 4 0.68 0.67 0.68 0.57
#> 5 0.67 0.65 0.60 0.59
#>
#> # ... with 995 more iterations
Trace plots for diagnosing convergence and mixing of the chains.
normal_draws <- as_draws_df(normal_fit$draws())
color_scheme_set("viridis")
mcmc_trace(normal_draws, pars = c("mu", "sigma"))
Histograms of the posterior distributions of the parameters.
mcmc_hist(normal_draws, pars = c("mu", "sigma"))
3.2.3 Diagnostic summary
normal_fit$diagnostic_summary()
#> $num_divergent
#> [1] 0 0 0 0
#>
#> $num_max_treedepth
#> [1] 0 0 0 0
#>
#> $ebfmi
#> [1] 1.005079 1.074466 1.096794 1.190007
num_divergent: the number of iterations (sampling transitions) in which the Hamiltonian trajectory failed (divergent transitions).
- Even 1–2 divergent transitions suggest the model may not be reliable, especially in hierarchical models.
- We need to reparameterize the model or increase adapt_delta to make the sampler take smaller steps.
num_max_treedepth: the number of iterations where there are not enough leapfrog steps.
- This can indicate that the model is complex or that the priors are too tight.
- We can increase max_treedepth to allow more steps, but this can increase the computation time.
ebfmi: energy Bayesian fraction of missing information (E-BFMI).
- Measures whether the resampled momentum generates enough variation in energy to explore the posterior efficiently.
- Low ebfmi means the sampler may be stuck in tight regions and exploring poorly, even if Rhat and ESS look fine.
- Guidelines:
  - ebfmi < 0.3: bad.
  - 0.3 < ebfmi <= 1: acceptable.
  - ebfmi >= 1: good.
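If these diagnostics do flag problems, the relevant sampler settings can be tightened when calling `$sample()`, without recompiling the model. A sketch using the normal model from above (argument names as in cmdstanr; this re-run requires a working CmdStan installation):

```r
# Re-run with more careful sampler settings (slower but more robust)
normal_fit_careful <- normal_mod$sample(
  data = normal_list,
  seed = 123,
  chains = 4,
  parallel_chains = 4,
  adapt_delta = 0.99,   # default 0.8; higher target acceptance -> smaller steps
  max_treedepth = 12,   # default 10; allows longer trajectories
  refresh = 0
)
```

In brms, the equivalent knobs are passed via `control = list(adapt_delta = 0.99, max_treedepth = 12)`.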
3.2.4 Methods text example
“We estimated posterior distributions of all parameters using the Hamiltonian Monte Carlo (HMC) algorithm implemented in Stan (Carpenter et al., 2017) with weakly informative priors (Gelman et al., 2008). The HMC algorithm uses gradient information to propose new states in the Markov chain, leading to a more efficient exploration of the target distribution than traditional Markov chain Monte Carlo (MCMC) methods that rely on random proposals (Carpenter et al., 2017). This efficiency allows us to achieve convergence with fewer iterations than traditional MCMC methods. Four independent chains were run for 2000 iterations for each model with a warm-up of 1000 iterations. Convergence of the posterior distribution was assessed using the Gelman–Rubin statistic with a convergence threshold of 1.1 (Gelman et al., 2013), ensuring effective sample sizes greater than 400 (Vehtari et al., 2021), and by monitoring divergent transitions (Betancourt, 2016) for all parameters.”
3.3 brms
brms uses an lme4-like syntax.
normal_fit_brm0 <- brm(y ~ 1, family = gaussian(), data = normal_df)
We can also specify priors for the parameters. The prior argument can be a bit tricky, since it is not always obvious which parameter names correspond to which parts of the model, especially as the model becomes more complex.
We can also control the number of warmup iterations, total iterations, and other sampling settings.
The cmdstanr backend is generally faster than the default rstan backend.
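To see which prior classes a given formula exposes before writing the `prior` argument, `brms::get_prior()` lists them (this call needs brms loaded and the `normal_df` data from above):

```r
# List the parameter classes brms will assign priors to for this model;
# for y ~ 1 with a Gaussian family these are "Intercept" and "sigma"
get_prior(y ~ 1, family = gaussian(), data = normal_df)
```

The `class` column of the result is what goes into `prior(..., class = "...")`.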
normal_fit_brm <- brm(
y ~ 1,
family = gaussian(),
data = normal_df,
prior = c(
prior(normal(0, 5), class = "Intercept"),
prior(cauchy(0, 2.5), class = "sigma")
),
seed = 123,
iter = 2000, # total iterations (warmup + post-warmup)
warmup = 1000,
chains = 4, cores = 4,
backend = "cmdstanr"
)
#> Running MCMC with 4 parallel chains...
#>
#> Chain 1 Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 1 Iteration: 100 / 2000 [ 5%] (Warmup)
#> Chain 1 Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 1 Iteration: 300 / 2000 [ 15%] (Warmup)
#> Chain 1 Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 1 Iteration: 500 / 2000 [ 25%] (Warmup)
#> Chain 1 Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 1 Iteration: 700 / 2000 [ 35%] (Warmup)
#> Chain 1 Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 1 Iteration: 900 / 2000 [ 45%] (Warmup)
#> Chain 1 Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 1 Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 1 Iteration: 1100 / 2000 [ 55%] (Sampling)
#> Chain 1 Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 1 Iteration: 1300 / 2000 [ 65%] (Sampling)
#> Chain 1 Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 1 Iteration: 1500 / 2000 [ 75%] (Sampling)
#> Chain 1 Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 1 Iteration: 1700 / 2000 [ 85%] (Sampling)
#> Chain 1 Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 1 Iteration: 1900 / 2000 [ 95%] (Sampling)
#> Chain 1 Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 2 Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 2 Iteration: 100 / 2000 [ 5%] (Warmup)
#> Chain 2 Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 2 Iteration: 300 / 2000 [ 15%] (Warmup)
#> Chain 2 Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 2 Iteration: 500 / 2000 [ 25%] (Warmup)
#> Chain 2 Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 2 Iteration: 700 / 2000 [ 35%] (Warmup)
#> Chain 2 Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 2 Iteration: 900 / 2000 [ 45%] (Warmup)
#> Chain 2 Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 2 Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 2 Iteration: 1100 / 2000 [ 55%] (Sampling)
#> Chain 2 Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 2 Iteration: 1300 / 2000 [ 65%] (Sampling)
#> Chain 2 Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 2 Iteration: 1500 / 2000 [ 75%] (Sampling)
#> Chain 2 Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 2 Iteration: 1700 / 2000 [ 85%] (Sampling)
#> Chain 2 Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 2 Iteration: 1900 / 2000 [ 95%] (Sampling)
#> Chain 2 Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 3 Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 3 Iteration: 100 / 2000 [ 5%] (Warmup)
#> Chain 3 Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 3 Iteration: 300 / 2000 [ 15%] (Warmup)
#> Chain 3 Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 3 Iteration: 500 / 2000 [ 25%] (Warmup)
#> Chain 3 Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 3 Iteration: 700 / 2000 [ 35%] (Warmup)
#> Chain 3 Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 3 Iteration: 900 / 2000 [ 45%] (Warmup)
#> Chain 3 Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 3 Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 3 Iteration: 1100 / 2000 [ 55%] (Sampling)
#> Chain 3 Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 3 Iteration: 1300 / 2000 [ 65%] (Sampling)
#> Chain 3 Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 3 Iteration: 1500 / 2000 [ 75%] (Sampling)
#> Chain 3 Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 3 Iteration: 1700 / 2000 [ 85%] (Sampling)
#> Chain 3 Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 3 Iteration: 1900 / 2000 [ 95%] (Sampling)
#> Chain 3 Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 4 Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 4 Iteration: 100 / 2000 [ 5%] (Warmup)
#> Chain 4 Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 4 Iteration: 300 / 2000 [ 15%] (Warmup)
#> Chain 4 Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 4 Iteration: 500 / 2000 [ 25%] (Warmup)
#> Chain 4 Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 4 Iteration: 700 / 2000 [ 35%] (Warmup)
#> Chain 4 Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 4 Iteration: 900 / 2000 [ 45%] (Warmup)
#> Chain 4 Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 4 Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 4 Iteration: 1100 / 2000 [ 55%] (Sampling)
#> Chain 4 Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 4 Iteration: 1300 / 2000 [ 65%] (Sampling)
#> Chain 4 Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 4 Iteration: 1500 / 2000 [ 75%] (Sampling)
#> Chain 4 Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 4 Iteration: 1700 / 2000 [ 85%] (Sampling)
#> Chain 4 Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 4 Iteration: 1900 / 2000 [ 95%] (Sampling)
#> Chain 4 Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 1 finished in 0.0 seconds.
#> Chain 2 finished in 0.0 seconds.
#> Chain 3 finished in 0.0 seconds.
#> Chain 4 finished in 0.0 seconds.
#>
#> All 4 chains finished successfully.
#> Mean chain execution time: 0.0 seconds.
#> Total execution time: 0.2 seconds.
3.3.1 Summary
summary(normal_fit_brm)
#> Family: gaussian
#> Links: mu = identity
#> Formula: y ~ 1
#> Data: normal_df (Number of observations: 100)
#> Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#> total post-warmup draws = 4000
#>
#> Regression Coefficients:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept 4.66 0.06 4.53 4.79 1.00 3017 2207
#>
#> Further Distributional Parameters:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sigma 0.65 0.05 0.56 0.75 1.00 3322 2774
#>
#> Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
3.3.2 Draws (posterior samples)
normal_draws_brm <- posterior::as_draws_df(normal_fit_brm)
normal_draws_brm
#> # A draws_df: 1000 iterations, 4 chains, and 5 variables
#> b_Intercept sigma Intercept lprior lp__
#> 1 4.8 0.63 4.8 -4.4 -102
#> 2 4.5 0.73 4.5 -4.4 -104
#> 3 4.5 0.76 4.5 -4.4 -105
#> 4 4.8 0.70 4.8 -4.4 -103
#> 5 4.8 0.61 4.8 -4.4 -104
#> 6 4.7 0.66 4.7 -4.4 -102
#> 7 4.6 0.63 4.6 -4.4 -102
#> 8 4.6 0.63 4.6 -4.4 -102
#> 9 4.7 0.69 4.7 -4.4 -103
#> 10 4.6 0.60 4.6 -4.4 -102
#> # ... with 3990 more draws
#> # ... hidden reserved variables {'.chain', '.iteration', '.draw'}
mcmc_trace(normal_draws_brm, pars = c("b_Intercept", "sigma"))
mcmc_hist(normal_draws_brm, pars = c("b_Intercept", "sigma"))
3.3.3 Diagnostic summary
rstan::check_hmc_diagnostics(normal_fit_brm$fit)
#>
#> Divergences:
#>
#> Tree depth:
#>
#> Energy:
4 Poisson model
Count data (e.g., species richness, seed counts, pollinator visits…) often follow a Poisson distribution.
\[ y_i \sim \text{Poisson}(\lambda) \]
\[ y_i \sim \operatorname{Poisson\_log}(\log \lambda) \]
Q: What are the mean and variance (\(\lambda\)) of species richness per plot?
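The two parameterizations are equivalent: `poisson_log` takes \(\log \lambda\) and evaluates the same log-likelihood, which is numerically more stable when modeling on the log scale. A quick check in base R:

```r
# Poisson log-pmf: y * log(lambda) - lambda - log(y!)
# This is what poisson_log evaluates with log_lambda = log(lambda)
lambda <- 12.3
yy <- 0:30
ll_pois <- dpois(yy, lambda, log = TRUE)
ll_pois_log <- yy * log(lambda) - lambda - lfactorial(yy)
all.equal(ll_pois, ll_pois_log)
#> [1] TRUE
```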
4.1 Dummy data
set.seed(123)
y <- rpois(n, lambda = 12.3)
pois_list <- list(
y = y,
N = n
)
pois_df <- tibble(
y = y,
N = n
)
4.2 CmdStan
stan/pois.stan
data {
int<lower=0> N;
array[N] int<lower=0> y;
}
parameters {
real<lower=0> lambda;
}
model {
y ~ poisson(lambda);
lambda ~ normal(0, 10);
}
stan/pois_re.stan
data {
int<lower=0> N;
array[N] int<lower=0> y;
}
parameters {
real log_lambda;
}
model {
y ~ poisson_log(log_lambda);
log_lambda ~ normal(0, 2.5);
}
pois_mod <- cmdstan_model("stan/pois.stan")
pois_re_mod <- cmdstan_model("stan/pois_re.stan")
pois_fit <- pois_mod$sample(
data = pois_list,
seed = 123,
chains = 4,
parallel_chains = 4,
iter_warmup = 1000, # number of warmup iterations
iter_sampling = 1000, # number of sampling iterations
refresh = 0 # suppress progress output
)
#> Running MCMC with 4 parallel chains...
#>
#> Chain 1 finished in 0.0 seconds.
#> Chain 2 finished in 0.0 seconds.
#> Chain 3 finished in 0.0 seconds.
#> Chain 4 finished in 0.0 seconds.
#>
#> All 4 chains finished successfully.
#> Mean chain execution time: 0.0 seconds.
#> Total execution time: 0.2 seconds.
pois_re_fit <- pois_re_mod$sample(
data = pois_list,
seed = 123,
chains = 4,
parallel_chains = 4,
iter_warmup = 1000, # number of warmup iterations
iter_sampling = 1000, # number of sampling iterations
refresh = 0 # suppress progress output
)
#> Running MCMC with 4 parallel chains...
#>
#> Chain 1 finished in 0.0 seconds.
#> Chain 2 finished in 0.0 seconds.
#> Chain 3 finished in 0.0 seconds.
#> Chain 4 finished in 0.0 seconds.
#>
#> All 4 chains finished successfully.
#> Mean chain execution time: 0.0 seconds.
#> Total execution time: 0.2 seconds.
pois_fit$summary()
#> # A tibble: 2 × 10
#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 lp__ 1836. 1836. 0.710 0.305 1834. 1836. 1.00 1898. 2556.
#> 2 lambda 12.2 12.2 0.349 0.344 11.6 12.8 1.00 1370. 1871.
pois_re_fit$summary()
#> # A tibble: 2 × 10
#> variable mean median sd mad q5 q95 rhat ess_bulk ess_tail
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 lp__ 1833. 1834. 0.714 0.324 1.83e3 1.83e3 1.00 2170. 2574.
#> 2 log_lambda 2.50 2.50 0.0287 0.0295 2.46e0 2.55e0 1.00 1367. 2197.
Note that exp(2.50) ≈ 12.2, matching the \(\lambda\) estimate from the direct parameterization.
4.3 brms
pois_fit_brm <- brm(y ~ 1,
family = poisson(),
data = pois_df,
refresh = 0,
backend = "cmdstanr")
#> Running MCMC with 4 sequential chains...
#>
#> Chain 1 finished in 0.0 seconds.
#> Chain 2 finished in 0.0 seconds.
#> Chain 3 finished in 0.0 seconds.
#> Chain 4 finished in 0.0 seconds.
#>
#> All 4 chains finished successfully.
#> Mean chain execution time: 0.0 seconds.
#> Total execution time: 0.5 seconds.
summary(pois_fit_brm)
#> Family: poisson
#> Links: mu = log
#> Formula: y ~ 1
#> Data: pois_df (Number of observations: 100)
#> Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#> total post-warmup draws = 4000
#>
#> Regression Coefficients:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept 2.50 0.03 2.44 2.56 1.00 1650 1850
#>
#> Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
5 Linear model
The simplest multiple linear regression model has two predictors (\(x_1\) and \(x_2\)), two slopes (\(\beta_1\) and \(\beta_2\)), an intercept (\(\beta_0\)), and normally distributed noise (\(\sigma\)). This model can be written in standard regression notation as
\[ y_i \sim N(\mu_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i}, \sigma) \]
or
\[ y_i \sim N(\mu_i = \boldsymbol{\beta \cdot x_i}, \sigma) \]
where \(\boldsymbol{\beta}\) is a 1 \(\times\) 3 coefficient matrix (a row vector) and \(\boldsymbol{x_i}\) is a 3 \(\times\) 1 predictor vector (whose first element is 1, for the intercept). Stacking the predictors into a 3 \(\times\) N matrix \(\boldsymbol{x}\) lets us compute the full \(\mu\) vector of length \(N\) in a single multiplication.
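The matrix form can be verified in R with toy numbers: multiplying the 1 × 3 row vector by the 3 × N predictor matrix reproduces the elementwise linear predictor.

```r
set.seed(1)
n <- 5
x <- rbind(1, rnorm(n), rnorm(n))          # 3 x n: intercept row plus two predictors
beta <- matrix(c(2, 1.2, -0.8), nrow = 1)  # 1 x 3 coefficient row vector
mu_matrix <- as.numeric(beta %*% x)        # single matrix multiplication
mu_elementwise <- beta[1] + beta[2] * x[2, ] + beta[3] * x[3, ]
all.equal(mu_matrix, mu_elementwise)
#> [1] TRUE
```

This is exactly what `beta * x` computes inside stan/lm.stan below.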
Q: What are the posterior means and 95% credible intervals of the coefficients (\(\beta_0\), \(\beta_1\), \(\beta_2\))?
set.seed(123)
n <- 100
beta <- c(2, 1.2, -0.8)
x1 <- rnorm(n, mean = 0, sd = 1)
x2 <- rnorm(n, mean = 0, sd = 1)
sigma <- 0.4
y <- rnorm(n, mean = beta[1] + beta[2] * x1 + beta[3] * x2, sd = sigma)
lm_list <- list(
y = y,
N = length(y),
x = rbind(1, x1, x2) # 3 x N matrix
)
lm_df <- tibble(
y = y,
N = length(y),
x1 = x1,
x2 = x2
)
5.1 CmdStan
stan/lm.stan
data {
int<lower=0> N;
vector[N] y;
matrix[3, N] x;
}
parameters {
row_vector[3] beta;
real<lower=0> sigma;
}
model {
y ~ normal(beta * x, sigma);
beta ~ normal(0, 2.5);
sigma ~ cauchy(0, 2.5);
}
lm_mod <- cmdstan_model("stan/lm.stan")
lm_fit <- lm_mod$sample(
data = lm_list,
seed = 123,
chains = 4,
parallel_chains = 4,
iter_warmup = 1000, # number of warmup iterations
iter_sampling = 1000, # number of sampling iterations
refresh = 0 # don't print update
)
#> Running MCMC with 4 parallel chains...
#>
#> Chain 1 finished in 0.0 seconds.
#> Chain 2 finished in 0.0 seconds.
#> Chain 3 finished in 0.0 seconds.
#> Chain 4 finished in 0.0 seconds.
#>
#> All 4 chains finished successfully.
#> Mean chain execution time: 0.0 seconds.
#> Total execution time: 0.2 seconds.
Use posterior::summarise_draws to check more detailed summary statistics.
lm_summary <- posterior::summarise_draws(
lm_fit$draws(),
mean = ~mean(.x),
sd = ~sd(.x),
~posterior::quantile2(.x, probs = c(0.025, 0.05, 0.25, 0.5, 0.75, 0.95, 0.975))
)
lm_summary
#> # A tibble: 5 × 10
#> variable mean sd q2.5 q5 q25 q50 q75 q95 q97.5
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 lp__ 44.6 1.48 40.8 41.6 43.8 44.9 45.7 46.3 46.4
#> 2 beta[1] 2.05 0.0390 1.98 1.99 2.03 2.05 2.08 2.12 2.13
#> 3 beta[2] 1.15 0.0430 1.06 1.08 1.12 1.15 1.17 1.22 1.23
#> 4 beta[3] -0.790 0.0400 -0.870 -0.856 -0.816 -0.789 -0.763 -0.726 -0.715
#> 5 sigma 0.386 0.0292 0.334 0.342 0.365 0.384 0.405 0.437 0.448
5.2 brms
lm_fit_brm <- brm(
y ~ x1 + x2,
family = gaussian(),
data = lm_df,
prior = c(
prior(normal(0, 2.5), class = "Intercept"),
prior(normal(0, 2.5), class = "b"),
prior(cauchy(0, 2.5), class = "sigma")
),
refresh = 0, # don't print update
backend = "cmdstanr"
)
#> Running MCMC with 4 sequential chains...
#>
#> Chain 1 finished in 0.0 seconds.
#> Chain 2 finished in 0.0 seconds.
#> Chain 3 finished in 0.0 seconds.
#> Chain 4 finished in 0.0 seconds.
#>
#> All 4 chains finished successfully.
#> Mean chain execution time: 0.0 seconds.
#> Total execution time: 0.5 seconds.
summary(lm_fit_brm)
#> Family: gaussian
#> Links: mu = identity
#> Formula: y ~ x1 + x2
#> Data: lm_df (Number of observations: 100)
#> Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#> total post-warmup draws = 4000
#>
#> Regression Coefficients:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept 2.05 0.04 1.97 2.13 1.00 4347 3260
#> x1 1.15 0.04 1.06 1.23 1.00 3995 2774
#> x2 -0.79 0.04 -0.87 -0.71 1.00 4302 2884
#>
#> Further Distributional Parameters:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sigma 0.39 0.03 0.34 0.45 1.00 4241 2851
#>
#> Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
lm_summary_brm <- posterior::summarise_draws(
posterior::as_draws_df(lm_fit_brm),
mean = ~mean(.x),
sd = ~sd(.x),
~posterior::quantile2(.x, probs = c(0.025, 0.05, 0.25, 0.5, 0.75, 0.95, 0.975))
)
lm_summary_brm
#> # A tibble: 7 × 10
#> variable mean sd q2.5 q5 q25 q50 q75 q95
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 b_Intercept 2.05 0.0394 1.97 1.99 2.03 2.05 2.08 2.12
#> 2 b_x1 1.15 0.0427 1.06 1.08 1.12 1.15 1.18 1.22
#> 3 b_x2 -0.790 0.0404 -0.868 -0.855 -0.816 -0.790 -0.762 -0.723
#> 4 sigma 0.386 0.0286 0.336 0.342 0.366 0.385 0.403 0.435
#> 5 Intercept 2.24 0.0391 2.17 2.18 2.22 2.24 2.27 2.31
#> 6 lprior -7.46 0.0172 -7.49 -7.48 -7.47 -7.45 -7.44 -7.43
#> 7 lp__ -54.3 1.51 -58.1 -57.2 -55.0 -53.9 -53.1 -52.5
#> # ℹ 1 more variable: q97.5 <dbl>
6 Varying intercepts
H: There is a power-law relationship (\(y =\beta_0x^{\beta_1}\)) between tree diameter (DBH) and tree maximum height, and the scaling factor \(\beta_0\) varies among species.
\[ \text{log}\; y_{ij} \sim N(\text{log}\; \beta_{0j} + \beta_1\, \text{log}\; x_{ij}, \sigma) \]
\[ \text{log}\; \beta_{0j} \sim N(\mu_0, \tau) \]
\[ \mu_0 \sim N(0, 2.5) \]
\[ \tau \sim \text{Half-Cauchy}(0, 1) \]
Q: What are the species-level scaling factors (\(\beta_{0j}\)) and the global scaling exponent (\(\beta_1\))?
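The model follows from taking logs of the power law: \(y = \beta_0 x^{\beta_1}\) implies \(\log y = \log \beta_0 + \beta_1 \log x\), which is linear in \(\log x\). A numerical check with toy values:

```r
# Power law is linear on the log scale
b0 <- 1.7
b1 <- 0.6
x <- c(10, 20, 40)
all.equal(log(b0 * x^b1), log(b0) + b1 * log(x))
#> [1] TRUE
```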
6.1 Dummy data
- We simulate a dataset with 10 species and 20 trees per species.
- Each species has wood density (wd) values.
set.seed(12345)
n_sp <- 10
n_rep <- 20
wd <- rnorm(n_sp, 0, 1)
gamma0 <- 0.6
gamma1 <- 0.1
sigma_y <- 0.1
b1_hat <- gamma1 * wd + gamma0
b1 <- rnorm(n_sp, b1_hat, 0.01)
log_b0 <- rnorm(n_sp, 0.55, 0.05)
# ---- simulate ----
allo_df <- tibble(
sp = factor(rep(paste0("sp", 1:n_sp), each = n_rep)),
wd = rep(wd, each = n_rep),
# now log_xx ~ Normal(mean log-dbh, sd = 0.5)
log_xx = rnorm(n_sp * n_rep, mean = log(40), sd = 0.5)) |>
mutate(
# add observation-level noise on log-height
log_y = rnorm(
n(),
log_b0[as.integer(sp)] + b1[as.integer(sp)] * log_xx,
sigma_y),
dbh = exp(log_xx),
h = exp(log_y)) |>
select(sp, wd, dbh, h)
dbh_hist <- allo_df |>
ggplot(aes(dbh)) +
geom_histogram() +
xlab("DBH (cm)") +
theme(
legend.position = "none",
plot.title = element_text(size = 24)
)
h_hist <- allo_df |>
ggplot(aes(h)) +
geom_histogram() +
xlab("Height (m)") +
theme(
legend.position = "none",
plot.title = element_text(size = 24)
)
p1 <- allo_df |>
ggplot(aes(x = dbh, y = h, col = sp)) +
geom_point() +
xlab("DBH (cm)") +
ylab(expression("Height (m)")) +
theme(
legend.position = "none",
plot.title = element_text(size = 24)
)
p2 <- p1 +
scale_x_log10() +
scale_y_log10()
dbh_hist + h_hist + p1 + p2
allo_list <- list(
log_h = log(allo_df$h),
log_dbh = log(allo_df$dbh),
sp = as.integer(allo_df$sp),
N = nrow(allo_df),
J = allo_df$sp |> unique() |> length()
)
6.2 CmdStan
stan/vint.stan
data {
int<lower=1> N; // total number of observations
int<lower=1> J; // number of species
array[N] int<lower=1, upper=J> sp; // sp index
vector[N] log_dbh; // predictor: log(DBH)
vector[N] log_h; // response: log(height)
}
parameters {
real mu0; // grand mean of the species intercepts log(β₀ⱼ)
real<lower=0> tau; // SD of species intercepts
vector[J] log_beta0; // species‐level intercepts = log(β₀ⱼ)
real beta1; // common slope on log(DBH)
real<lower=0> sigma; // residual SD
}
model {
// Likelihood
log_h ~ normal(log_beta0[sp] + beta1 * log_dbh, sigma);
// Priors
mu0 ~ normal(0, 2.5);
tau ~ cauchy(0, 1);
log_beta0 ~ normal(mu0, tau);
beta1 ~ normal(0, 2.5);
sigma ~ cauchy(0, 1);
}
// we need this for model selection
generated quantities {
vector[N] log_lik;
for (i in 1:N)
log_lik[i] = normal_lpdf(log_h[i]
| log_beta0[sp[i]] + beta1 * log_dbh[i],
sigma);
}
vint_mod <- cmdstan_model("stan/vint.stan")
vint_fit <- vint_mod$sample(
data = allo_list,
seed = 123,
chains = 4,
parallel_chains = 4,
iter_warmup = 1000, # number of warmup iterations
iter_sampling = 1000, # number of sampling iterations
refresh = 0 # don't print update
)
#> Running MCMC with 4 parallel chains...
#> Chain 1 finished in 0.2 seconds.
#> Chain 2 finished in 0.2 seconds.
#> Chain 3 finished in 0.2 seconds.
#> Chain 4 finished in 0.2 seconds.
#>
#> All 4 chains finished successfully.
#> Mean chain execution time: 0.2 seconds.
#> Total execution time: 0.3 seconds.
vint_fit$summary()
#> # A tibble: 215 × 10
#> variable mean median sd mad q5 q95 rhat ess_bulk
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 lp__ 365. 366. 2.69 2.59 360. 369. 1.00 1185.
#> 2 mu0 0.483 0.484 0.119 0.114 0.285 0.679 1.00 1081.
#> 3 tau 0.332 0.314 0.0912 0.0781 0.221 0.503 1.00 1508.
#> 4 log_beta0[1] 0.782 0.782 0.0565 0.0569 0.689 0.875 1.01 416.
#> 5 log_beta0[2] 0.921 0.923 0.0544 0.0542 0.832 1.01 1.01 419.
#> 6 log_beta0[3] 0.440 0.442 0.0555 0.0564 0.349 0.531 1.01 413.
#> 7 log_beta0[4] 0.265 0.266 0.0580 0.0597 0.171 0.359 1.01 393.
#> 8 log_beta0[5] 0.620 0.621 0.0538 0.0535 0.531 0.708 1.01 420.
#> 9 log_beta0[6] -0.0154 -0.0146 0.0545 0.0550 -0.105 0.0733 1.01 432.
#> 10 log_beta0[7] 0.710 0.712 0.0545 0.0546 0.621 0.798 1.01 420.
#> # ℹ 205 more rows
#> # ℹ 1 more variable: ess_tail <dbl>
vint_fit$diagnostic_summary()
#> $num_divergent
#> [1] 0 0 0 0
#>
#> $num_max_treedepth
#> [1] 0 0 0 0
#>
#> $ebfmi
#> [1] 1.0896007 0.9152977 1.0290832 1.0009125
6.3 brms
priors <- c(
prior(normal(0, 2.5), class = Intercept),
prior(normal(0, 2.5), class = b),
prior(cauchy(0, 1), class = sd),
prior(cauchy(0, 1), class = sigma)
)
vint_fit_brm <- brm(
log(h) ~ log(dbh) + (1 | sp),
family = gaussian(),
data = allo_df,
prior = priors,
seed = 123,
refresh = 0, # don't print update
backend = "cmdstanr"
)
#> Running MCMC with 4 sequential chains...
#>
#> Chain 1 finished in 0.5 seconds.
#> Chain 2 finished in 0.5 seconds.
#> Chain 3 finished in 0.5 seconds.
#> Chain 4 finished in 0.4 seconds.
#>
#> All 4 chains finished successfully.
#> Mean chain execution time: 0.5 seconds.
#> Total execution time: 2.3 seconds.
summary(vint_fit_brm)
#> Family: gaussian
#> Links: mu = identity
#> Formula: log(h) ~ log(dbh) + (1 | sp)
#> Data: allo_df (Number of observations: 200)
#> Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#> total post-warmup draws = 4000
#>
#> Multilevel Hyperparameters:
#> ~sp (Number of levels: 10)
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sd(Intercept) 0.33 0.09 0.21 0.55 1.01 714 1210
#>
#> Regression Coefficients:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept 0.47 0.12 0.23 0.70 1.00 813 1101
#> logdbh 0.61 0.01 0.59 0.64 1.00 2035 1824
#>
#> Further Distributional Parameters:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sigma 0.10 0.01 0.09 0.11 1.00 2004 2100
#>
#> Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
plot(vint_fit_brm)
6.4 lme4
vint_fit_lme <- lme4::lmer(
log(h) ~ log(dbh) + (1 | sp),
data = allo_df)
summary(vint_fit_lme)
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: log(h) ~ log(dbh) + (1 | sp)
#> Data: allo_df
#>
#> REML criterion at convergence: -298.3
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -2.66966 -0.62316 0.04254 0.60122 2.28315
#>
#> Random effects:
#> Groups Name Variance Std.Dev.
#> sp (Intercept) 0.084434 0.29057
#> Residual 0.009794 0.09897
#> Number of obs: 200, groups: sp, 10
#>
#> Fixed effects:
#> Estimate Std. Error t value
#> (Intercept) 0.47916 0.10454 4.584
#> log(dbh) 0.61154 0.01314 46.525
#>
#> Correlation of Fixed Effects:
#> (Intr)
#> log(dbh) -0.472
6.5 Visualization and predictions
6.5.1 CmdStan
# Predicted DBH-height curves per species with 95% intervals
pred_grid <- allo_df |>
dplyr::group_by(sp) |>
dplyr::summarise(
dbh = list(seq(min(dbh), max(dbh), length.out = 60)),
.groups = "drop"
) |>
tidyr::unnest(dbh)
cmd_draws <- posterior::as_draws_matrix(vint_fit$draws(c("beta1", "log_beta0")))
beta1_draws <- cmd_draws[, "beta1"]
log_beta0_cols <- grep("^log_beta0\\[", colnames(cmd_draws), value = TRUE)
species_levels <- levels(allo_df$sp)
DBH values for the predicted line of each species:
pred_grid
#> # A tibble: 600 × 2
#> sp dbh
#> <fct> <dbl>
#> 1 sp1 12.2
#> 2 sp1 14.0
#> 3 sp1 15.8
#> 4 sp1 17.6
#> 5 sp1 19.5
#> 6 sp1 21.3
#> 7 sp1 23.1
#> 8 sp1 25.0
#> 9 sp1 26.8
#> 10 sp1 28.6
#> # ℹ 590 more rows
Posterior draws of the slope parameter (\(\beta_1\)).
str(beta1_draws)
#> 'draws_matrix' num [1:4000, 1] 0.638 0.635 0.62 0.627 0.629 ...
#> - attr(*, "dimnames")=List of 2
#> ..$ draw : chr [1:4000] "1" "2" "3" "4" ...
#> ..$ variable: chr "beta1"
#> - attr(*, "nchains")= int 4Column names for log-transformed species-level intercepts (\(\log \beta_{0j}\)). We use this to extract posterior draws for each species.
str(log_beta0_cols)
#> chr [1:10] "log_beta0[1]" "log_beta0[2]" "log_beta0[3]" "log_beta0[4]" ...
pred_cmd_summary <- dplyr::bind_cols(
pred_grid,
purrr::map2_dfr(
match(pred_grid$sp, species_levels),
log(pred_grid$dbh),
~{
log_mu <- cmd_draws[, log_beta0_cols[.x]] + beta1_draws * .y
height_draws <- exp(log_mu)
tibble::tibble(
estimate = stats::median(height_draws),
lower = stats::quantile(height_draws, 0.025),
upper = stats::quantile(height_draws, 0.975)
)
}
)
)
Compute predicted height summary for each species and DBH.
pred_cmd_summary
#> # A tibble: 600 × 5
#> sp dbh estimate lower upper
#> <fct> <dbl> <dbl> <dbl> <dbl>
#> 1 sp1 12.2 10.1 9.53 10.6
#> 2 sp1 14.0 11.0 10.4 11.6
#> 3 sp1 15.8 11.8 11.2 12.4
#> 4 sp1 17.6 12.6 12.0 13.3
#> 5 sp1 19.5 13.4 12.8 14.1
#> 6 sp1 21.3 14.2 13.5 14.9
#> 7 sp1 23.1 14.9 14.2 15.6
#> 8 sp1 25.0 15.6 14.9 16.3
#> 9 sp1 26.8 16.3 15.6 17.1
#> 10 sp1 28.6 17.0 16.2 17.8
#> # ℹ 590 more rows
ggplot(pred_cmd_summary, aes(dbh, estimate, colour = sp)) +
geom_ribbon(aes(ymin = lower, ymax = upper, fill = sp), alpha = 0.15, colour = NA) +
geom_line() +
geom_point(
data = allo_df,
aes(dbh, h, colour = sp),
size = 1,
alpha = 0.6,
inherit.aes = FALSE
) +
scale_x_log10() +
scale_y_log10() +
guides(fill = "none") +
labs(
x = "DBH (cm)",
y = "Height (m)",
colour = "Species"
)
6.5.2 brms
With fitted() available, preparing pred_summary is simpler than in the CmdStan workflow.
pred_summary <- fitted(vint_fit_brm, newdata = pred_grid, re_formula = NULL) |>
tibble::as_tibble() |>
dplyr::bind_cols(pred_grid) |>
dplyr::mutate(
estimate = exp(Estimate),
lower = exp(Q2.5),
upper = exp(Q97.5)
)
ggplot(pred_summary, aes(dbh, estimate, color = sp)) +
geom_ribbon(aes(ymin = lower, ymax = upper, fill = sp), alpha = 0.15, colour = NA) +
geom_line() +
geom_point(
data = allo_df,
aes(dbh, h, colour = sp),
size = 1,
alpha = 0.6,
inherit.aes = FALSE
) +
scale_x_log10() +
scale_y_log10() +
guides(fill = "none") +
labs(
x = "DBH (cm)",
y = "Height (m)",
colour = "Species"
)
7 Reparameterization for multilevel models
Diagnosing Biased Inference with Divergences by Michael Betancourt
7.1 Eight schools example
What is the effect of the treatment (a new education method) on students’ scores for each school and across schools?
| school | y | sigma |
|---|---|---|
| School_1 | 28 | 15 |
| School_2 | 8 | 10 |
| School_3 | -3 | 16 |
| School_4 | 7 | 11 |
| School_5 | -1 | 9 |
| School_6 | 1 | 11 |
| School_7 | 18 | 10 |
| School_8 | 12 | 18 |
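The table above can be packaged into the list format that cmdstan expects; a minimal sketch, assuming a Stan program that declares the data as J, y, and sigma (the conventional names in the eight-schools example):

```r
# Eight-schools data from the table above, in cmdstan's list format
schools_list <- list(
  J = 8,                                   # number of schools
  y = c(28, 8, -3, 7, -1, 1, 18, 12),      # observed treatment effects
  sigma = c(15, 10, 16, 11, 9, 11, 10, 18) # known standard errors (data, not parameters)
)
```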
7.2 Centered parameterization
\[ y_j \sim \mathcal{N}\bigl(\theta_j,\;\sigma_j\bigr) \]
\[ \theta_j \sim \mathcal{N}\bigl(\mu,\;\tau \bigr) \]
\[ \mu \sim \mathcal{N}\bigl(0,\;5 \bigr) \]
\[ \tau \sim \text{Half-Cauchy}(0, 2.5) \]
Note: \(\sigma_j\) is a known constant (i.e., data), so we don’t have to estimate it.
This parameterization is intuitive, but it often leads to convergence issues in hierarchical models.
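To see why, we can simulate from the centered prior: when \(\tau\) is small, the \(\theta_j\) collapse toward \(\mu\), producing the funnel-shaped geometry that HMC struggles to explore. A quick base-R sketch (illustration only, not part of the model code):

```r
set.seed(123)
# Centered prior: tau ~ Half-Cauchy(0, 2.5), theta ~ N(0, tau)
tau <- abs(rcauchy(5000, 0, 2.5))
theta <- rnorm(5000, mean = 0, sd = tau)
# The spread of theta shrinks with tau: the "funnel"
sd(theta[tau < 0.5])    # narrow neck of the funnel
sd(theta[tau > 5])      # wide mouth of the funnel
# plot(theta, log(tau)) # visualizes the funnel shape
```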
7.3 Non-centered parameterization
\[ \tilde{\theta_j} \sim \mathcal{N}\bigl(0,\;1 \bigr) \]
\[ \theta_j = \mu + \tau \cdot \tilde{\theta_j} \]
In this parameterization, we introduce a new latent variable \(\tilde{\theta_j}\) that is independent of the other parameters. This allows the sampler to explore the posterior more efficiently and avoids problems with convergence. We often use this parameterization when we have hierarchical models with varying intercepts or slopes.
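The two parameterizations are mathematically equivalent: shifting and scaling a standard normal reproduces \(\mathcal{N}(\mu, \tau)\). A quick sanity check in base R (illustrative values only):

```r
set.seed(123)
mu <- 5
tau <- 2
theta_tilde <- rnorm(1e5, mean = 0, sd = 1)  # theta_tilde ~ N(0, 1)
theta <- mu + tau * theta_tilde              # non-centered: theta = mu + tau * theta_tilde
c(mean(theta), sd(theta))                    # approximately mu and tau
```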
8 Varying slopes and intercepts
H: There is a power-law relationship (\(y =\beta_0 x^{\beta_1}\)) between tree diameter (DBH) and tree maximum height, and both the scaling factor \(\beta_0\) and the exponent \(\beta_1\) vary among species.
\[ \text{log}\; y_{ij} \sim N(\text{log}\; \beta_{0j} + \beta_{1j}\, \text{log}\; x_{ij}, \sigma) \]
\[ \begin{bmatrix} \text{log}\; \beta_{0j} \\ \beta_{1j} \end{bmatrix} \sim \mathrm{MVN} \Biggl( \boldsymbol{\mu} = \begin{bmatrix} \mu_{0} \\ \mu_{1} \end{bmatrix}, \boldsymbol{\Sigma} = \begin{bmatrix} \sigma_{0}^2 & \rho \sigma_{0} \sigma_{1} \\ \rho \sigma_{0} \sigma_{1} & \sigma_{1}^2 \end{bmatrix} \Biggr) \]
For reference, the multivariate normal (MVN) distribution looks like this:
Q: What are the species-level scaling factors (\(\beta_{0j}\)) and scaling exponents (\(\beta_{1j}\))? What are the global scaling factor (\(\mu_0\)) and scaling exponent (\(\mu_1\))?
8.1 Centered parameterization
\[ \boldsymbol{\mu} \sim N(0, 2.5) \]
\[ \boldsymbol{\Sigma} \sim \mathcal{W}^{-1}_p(\nu,\,\Psi) \]
The inverse-Wishart distribution \(\mathcal{W}^{-1}\) is the conjugate prior for the covariance matrix \(\boldsymbol{\Sigma}\) of an MVN. However, despite this conjugacy, inverse-Wishart priors are often computationally difficult to sample from.
Recommendation: use a Cholesky factorization with independent priors on scales and correlation (below).
8.2 Non-Centered parameterization
Optimization through Cholesky factorization
\[ \boldsymbol{z}_j \sim \mathcal{N}(\mathbf0,\,I_2) \] \[ \begin{pmatrix}\log\beta_{0j}\\[3pt]\beta_{1j}\end{pmatrix} = \boldsymbol{\mu} + L\, \boldsymbol{z}_j \]
Some linear algebra (Cholesky decomposition):
\[ \Sigma \;=\;L\,L^\top \;=\;\bigl[\mathrm{diag}(\tau)\,L_\Omega\bigr]\, \bigl[L_\Omega^\top\,\mathrm{diag}(\tau)\bigr] \;=\;\mathrm{diag}(\tau)\,\Omega\,\mathrm{diag}(\tau). \]
\[ L_\Omega \sim \mathrm{LKJ\_Corr\_Cholesky}(2) \quad \text{or} \quad \Omega \sim \mathrm{LKJ\_Corr}(2) \] \[ \tau_i \sim \mathrm{Half\text{-}Cauchy}(0,1) \]
Rather than sampling the covariance matrix \(\Sigma\) or correlation matrix \(\Omega\) directly, sampling the Cholesky factor \(L_\Omega\) is computationally more efficient and numerically stable.
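The decomposition identity above is easy to verify numerically in base R (`chol()` returns the upper-triangular factor, so we transpose to get the lower-triangular \(L\); the scale and correlation values below are made up):

```r
tau <- c(0.5, 0.2)                          # scales
Omega <- matrix(c(1, 0.3, 0.3, 1), 2, 2)    # correlation matrix
Sigma <- diag(tau) %*% Omega %*% diag(tau)  # covariance: diag(tau) * Omega * diag(tau)
L_Omega <- t(chol(Omega))                   # lower Cholesky factor of Omega
L <- diag(tau) %*% L_Omega                  # L = diag(tau) * L_Omega
all.equal(L %*% t(L), Sigma)                # Sigma = L L^T
```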
In the LKJ() prior:
\(\eta = 1\) indicates a flat prior (uniform distribution) over correlation matrices.
\(\eta > 1\) indicates weaker correlations.
We usually use \(\eta = 2\) because it gently favors weak correlations among the parameters while still allowing moderate to strong ones.
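For a \(2 \times 2\) correlation matrix, the LKJ density reduces to \(p(\rho) \propto (1 - \rho^2)^{\eta - 1}\), which makes the effect of \(\eta\) easy to see. A small unnormalized illustration (not part of the model code):

```r
# Unnormalized LKJ density of the single correlation rho in the 2x2 case
lkj_density <- function(rho, eta) (1 - rho^2)^(eta - 1)
lkj_density(0, 1); lkj_density(0.9, 1)  # eta = 1: flat over rho
lkj_density(0, 2); lkj_density(0.9, 2)  # eta = 2: mass pulled toward rho = 0
# curve(lkj_density(x, 2), -1, 1)       # visualize the implied prior on rho
```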
8.3 CmdStan
stan/vslope.stan
data {
int<lower=0> N; // num individuals
int<lower=1> K; // num ind predictors
int<lower=1> J; // num groups
array[N] int<lower=1, upper=J> jj; // group for individual
int<lower=1> L; // number of group-level predictors
matrix[N, K] x; // individual predictors
matrix[L, J] u; // group predictors
vector[N] y; // outcomes
}
parameters {
matrix[K, L] gamma; // coefficients across groups
real<lower=0> sigma; // prediction error scale
vector<lower=0>[K] tau; // prior scale
matrix[K, J] z;
cholesky_factor_corr[K] L_Omega;
}
transformed parameters {
matrix[K, J] beta;
// mu = gamma * u
beta = gamma * u + diag_pre_multiply(tau, L_Omega) * z;
}
model {
to_vector(z) ~ std_normal();
// tau ~ cauchy(0, 2.5);
tau ~ normal(0, 2.5);
L_Omega ~ lkj_corr_cholesky(2);
to_vector(gamma) ~ normal(0, 2.5);
for (n in 1:N) {
y[n] ~ normal(x[n, ] * beta[, jj[n]], sigma);
}
}
allo_vslope_list <- list(
y = log(allo_df$h),
x = cbind(1, log(allo_df$dbh)),
jj = as.integer(allo_df$sp),
N = nrow(allo_df),
K = 2, # number of predictors (intercept + slope)
J = allo_df$sp |> unique() |> length(),
L = 1 # number of group-level predictors (intercept)
)
allo_vslope_list$u <- matrix(1, nrow = allo_vslope_list$L, ncol = allo_vslope_list$J)
vslope_mod <- cmdstan_model("stan/vslope.stan")
vslope_fit <- vslope_mod$sample(
data = allo_vslope_list,
seed = 1234,
chains = 4,
parallel_chains = 4,
iter_warmup = 1000, # number of warmup iterations
iter_sampling = 1000, # number of sampling iterations
adapt_delta = 0.95, # increase adapt_delta to avoid divergent transitions
max_treedepth = 15, # increase max_treedepth to avoid max treedepth errors
refresh = 0, # don't print update
show_messages = FALSE # suppress Stan’s “Informational Message” output
)
# vslope_fit$summary()
vslope_summary <- posterior::summarise_draws(
vslope_fit$draws(),
mean = ~mean(.x),
sd = ~sd(.x),
~posterior::quantile2(.x, probs = c(0.025, 0.05, 0.25, 0.5, 0.75, 0.95, 0.975))
)
vslope_summary
#> # A tibble: 50 × 10
#> variable mean sd q2.5 q5 q25 q50 q75 q95
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 lp__ 3.60e+2 4.50 3.50e+2 3.52e+2 357. 3.60e+2 3.63e+2 366.
#> 2 gamma[1,1] 5.02e-1 0.0547 3.97e-1 4.13e-1 0.466 5.04e-1 5.37e-1 0.590
#> 3 gamma[2,1] 6.06e-1 0.0310 5.44e-1 5.56e-1 0.587 6.06e-1 6.25e-1 0.658
#> 4 sigma 9.24e-2 0.00469 8.37e-2 8.49e-2 0.0892 9.23e-2 9.55e-2 0.100
#> 5 tau[1] 6.96e-2 0.0553 2.93e-3 5.53e-3 0.0276 5.86e-2 9.79e-2 0.171
#> 6 tau[2] 8.84e-2 0.0280 4.88e-2 5.27e-2 0.0694 8.41e-2 1.02e-1 0.140
#> 7 z[1,1] 5.80e-1 0.900 -1.37e+0 -1.01e+0 0.0366 6.36e-1 1.19e+0 1.98
#> 8 z[2,1] 6.58e-1 0.628 -7.80e-1 -4.61e-1 0.305 7.11e-1 1.06e+0 1.59
#> 9 z[1,2] 1.61e-1 0.994 -1.86e+0 -1.52e+0 -0.515 1.87e-1 8.53e-1 1.72
#> 10 z[2,2] 1.29e+0 0.667 -1.37e-1 1.73e-1 0.890 1.31e+0 1.72e+0 2.35
#> # ℹ 40 more rows
#> # ℹ 1 more variable: q97.5 <dbl>
vslope_fit$diagnostic_summary()
#> $num_divergent
#> [1] 0 0 0 0
#>
#> $num_max_treedepth
#> [1] 0 0 0 0
#>
#> $ebfmi
#> [1] 0.8183985 0.7952493 0.8137258 0.8179199
8.4 brms
priors <- c(
prior(normal(0, 2.5), class = Intercept),
prior(normal(0, 2.5), class = b),
prior(lkj(2), class = cor), # Ω ~ LKJ(2)
prior(cauchy(0, 1), class = sigma)
)
slope_fit_brm <- brm(
log(h) ~ log(dbh) + (1 + log(dbh) | sp),
family = gaussian(),
data = allo_df,
prior = priors,
refresh = 0, # don't print update
control = list(
adapt_delta = 0.95, # default is 0.8
max_treedepth = 15 # default is 10
),
backend = "cmdstanr"
)
#> Running MCMC with 4 sequential chains...
#>
#> Chain 1 finished in 3.2 seconds.
#> Chain 2 finished in 3.8 seconds.
#> Chain 3 finished in 3.7 seconds.
#> Chain 4 finished in 3.6 seconds.
#>
#> All 4 chains finished successfully.
#> Mean chain execution time: 3.6 seconds.
#> Total execution time: 14.7 seconds.
Divergent transitions! We need to increase adapt_delta and max_treedepth, and we may also need to change the priors. Note that reparameterization is less straightforward in brms than in hand-written Stan code.
summary(slope_fit_brm)
#> Warning: There were 23 divergent transitions after warmup. Increasing
#> adapt_delta above 0.95 may help. See
#> http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> Family: gaussian
#> Links: mu = identity
#> Formula: log(h) ~ log(dbh) + (1 + log(dbh) | sp)
#> Data: allo_df (Number of observations: 200)
#> Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#> total post-warmup draws = 4000
#>
#> Multilevel Hyperparameters:
#> ~sp (Number of levels: 10)
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
#> sd(Intercept) 0.07 0.06 0.00 0.21 1.00 2079
#> sd(logdbh) 0.09 0.03 0.05 0.16 1.00 1023
#> cor(Intercept,logdbh) 0.09 0.42 -0.72 0.83 1.00 806
#> Tail_ESS
#> sd(Intercept) 1896
#> sd(logdbh) 532
#> cor(Intercept,logdbh) 1347
#>
#> Regression Coefficients:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept 0.50 0.05 0.39 0.61 1.00 3102 773
#> logdbh 0.61 0.03 0.55 0.67 1.01 1079 389
#>
#> Further Distributional Parameters:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sigma 0.09 0.00 0.08 0.10 1.00 4735 2977
#>
#> Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
8.5 lme4
slope_fit_lme <- lme4::lmer(
log(h) ~ log(dbh) + (1 + log(dbh) | sp),
data = allo_df)
Singular fit! The random-effect correlation is estimated at exactly 1.00, so the model is over-parameterized for these data.
summary(slope_fit_lme)
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: log(h) ~ log(dbh) + (1 + log(dbh) | sp)
#> Data: allo_df
#>
#> REML criterion at convergence: -327.2
#>
#> Scaled residuals:
#> Min 1Q Median 3Q Max
#> -3.14307 -0.63987 0.08714 0.61696 3.03735
#>
#> Random effects:
#> Groups Name Variance Std.Dev. Corr
#> sp (Intercept) 0.001326 0.03641
#> log(dbh) 0.004607 0.06787 1.00
#> Residual 0.008397 0.09164
#> Number of obs: 200, groups: sp, 10
#>
#> Fixed effects:
#> Estimate Std. Error t value
#> (Intercept) 0.49980 0.04759 10.50
#> log(dbh) 0.60637 0.02468 24.57
#>
#> Correlation of Fixed Effects:
#> (Intr)
#> log(dbh) -0.264
#> optimizer (nloptwrap) convergence code: 0 (OK)
#> boundary (singular) fit: see help('isSingular')
9 Varying slopes and intercepts, and group-level predictors
H: There is a power-law relationship (\(y =\beta_0 x^{\beta_1}\)) between tree diameter (DBH) and tree maximum height, and both the scaling factor \(\beta_0\) and the exponent \(\beta_1\) vary among species. These species-level variations may be explained by each species’ wood density (group-level predictor \(u\)).
\[ \text{log}\; y_{ij} \sim N(\text{log}\; \beta_{0j} + \beta_{1j}\, \text{log}\; x_{ij}, \sigma) \]
\[ \begin{bmatrix} \text{log}\; \beta_{0j} \\ \beta_{1j} \end{bmatrix} \sim \mathrm{MVN} \Biggl( \underbrace{ \begin{bmatrix} \gamma_{0}^{(\beta_0)} + \gamma_{1}^{(\beta_0)}u_{j} \\ \gamma_{0}^{(\beta_1)} + \gamma_{1}^{(\beta_1)}u_{j} \end{bmatrix}}_{\text{mean depends on }u_j}, \begin{bmatrix} \sigma_{0}^2 & \rho \sigma_{0} \sigma_{1} \\ \rho \sigma_{0} \sigma_{1} & \sigma_{1}^2 \end{bmatrix} \Biggr) \]
Q: What are the coefficients for the species-level regressions (\(\gamma_{0}^{(\beta_0)}\), \(\gamma_{1}^{(\beta_0)}\), \(\gamma_{0}^{(\beta_1)}\), and \(\gamma_{1}^{(\beta_1)}\))?
Now we have two group-level predictors (intercept and slope), and \(u\) includes the intercept and wood density. The rest of this section follows the setup from the varying slopes and intercepts section.
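In stan/vslope.stan, the species-level means are the matrix product gamma * u. A small base-R sketch of the shapes involved (the coefficient and wood-density values below are made up for illustration):

```r
K <- 2; L <- 2; J <- 3                  # predictors, group-level predictors, species
gamma <- matrix(c(0.5, 0.6,             # column 1: intercepts of the group-level regressions
                  0.04, 0.08), K, L)    # column 2: wood-density effects
u <- rbind(rep(1, J),                   # row 1: intercept
           c(0.3, 0.5, 0.7))            # row 2: wood density of each species
mu_beta <- gamma %*% u                  # K x J matrix of species-level means
mu_beta[, 1]                            # mean log(beta0) and beta1 for species 1
```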
allo_vslope_list$L <- 2
allo_vslope_list$u <- rbind(
rep(1, allo_vslope_list$J),
trait
)
str(allo_vslope_list)
#> List of 8
#> $ y : num [1:200] 3.29 3.66 3.7 3.58 3.19 ...
#> $ x : num [1:200, 1:2] 1 1 1 1 1 1 1 1 1 1 ...
#> $ jj: int [1:200] 1 1 1 1 1 1 1 1 1 1 ...
#> $ N : int 200
#> $ K : num 2
#> $ J : int 10
#> $ L : num 2
#> $ u : num [1:2, 1:10] 1 0.586 1 0.709 1 ...
#> ..- attr(*, "dimnames")=List of 2
#> .. ..$ : chr [1:2] "" "trait"
#> .. ..$ : NULL
vgrp_fit <- vslope_mod$sample(
data = allo_vslope_list,
seed = 1234,
chains = 4,
parallel_chains = 4,
iter_warmup = 1000, # number of warmup iterations
iter_sampling = 1000, # number of sampling iterations
adapt_delta = 0.95, # increase adapt_delta to avoid divergent transitions
max_treedepth = 15, # increase max_treedepth to avoid max treedepth errors
refresh = 0 # don't print update
)
#> Running MCMC with 4 parallel chains...
#> Chain 4 finished in 4.0 seconds.
#> Chain 2 finished in 4.6 seconds.
#> Chain 1 finished in 4.7 seconds.
#> Chain 3 finished in 5.2 seconds.
#>
#> All 4 chains finished successfully.
#> Mean chain execution time: 4.6 seconds.
#> Total execution time: 5.2 seconds.
vgrp_fit$summary()
#> # A tibble: 52 × 10
#> variable mean median sd mad q5 q95 rhat ess_bulk
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 lp__ 358. 358. 4.51 4.36 350. 3.65e+2 1.00 1004.
#> 2 gamma[1,1] 0.503 0.504 0.0516 0.0498 0.420 5.87e-1 1.00 3069.
#> 3 gamma[2,1] 0.618 0.618 0.0148 0.0148 0.593 6.41e-1 1.00 2703.
#> 4 gamma[1,2] 0.0414 0.0410 0.0627 0.0605 -0.0630 1.46e-1 1.00 3220.
#> 5 gamma[2,2] 0.0805 0.0802 0.0183 0.0177 0.0512 1.10e-1 1.000 2617.
#> 6 sigma 0.0926 0.0924 0.00494 0.00483 0.0847 1.01e-1 1.00 5732.
#> 7 tau[1] 0.0641 0.0565 0.0459 0.0459 0.00552 1.47e-1 1.00 1564.
#> 8 tau[2] 0.0246 0.0235 0.0131 0.0117 0.00500 4.76e-2 1.00 1210.
#> 9 z[1,1] 0.409 0.430 0.814 0.773 -0.897 1.71e+0 1.00 3208.
#> 10 z[2,1] 0.171 0.221 0.760 0.715 -1.16 1.35e+0 1.000 2873.
#> # ℹ 42 more rows
#> # ℹ 1 more variable: ess_tail <dbl>
vgrp_fit$diagnostic_summary()
#> $num_divergent
#> [1] 1 0 0 0
#>
#> $num_max_treedepth
#> [1] 0 0 0 0
#>
#> $ebfmi
#> [1] 0.8973376 0.7535103 0.9068761 0.7740817
9.1 brms
priors <- c(
prior(normal(0, 2.5), class = Intercept),
prior(normal(0, 2.5), class = b),
prior(lkj(2), class = cor), # Ω ~ LKJ(2)
prior(cauchy(0, 1), class = sigma)
)
vgrp_fit_brm <- brm(
log(h) ~ log(dbh) * wd + (1 + log(dbh) | sp),
family = gaussian(),
data = allo_df,
prior = priors,
refresh = 0, # don't print update
control = list(
adapt_delta = 0.95, # default is 0.8
max_treedepth = 15 # default is 10
),
backend = "cmdstanr"
)
#> Running MCMC with 4 sequential chains...
#>
#> Chain 1 finished in 4.2 seconds.
#> Chain 2 finished in 4.7 seconds.
#> Chain 3 finished in 4.5 seconds.
#> Chain 4 finished in 4.9 seconds.
#>
#> All 4 chains finished successfully.
#> Mean chain execution time: 4.6 seconds.
#> Total execution time: 18.6 seconds.
summary(vgrp_fit_brm)
#> Warning: There were 5 divergent transitions after warmup. Increasing
#> adapt_delta above 0.95 may help. See
#> http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> Family: gaussian
#> Links: mu = identity
#> Formula: log(h) ~ log(dbh) * wd + (1 + log(dbh) | sp)
#> Data: allo_df (Number of observations: 200)
#> Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#> total post-warmup draws = 4000
#>
#> Multilevel Hyperparameters:
#> ~sp (Number of levels: 10)
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
#> sd(Intercept) 0.09 0.07 0.00 0.25 1.00 1698
#> sd(logdbh) 0.08 0.03 0.04 0.15 1.00 1335
#> cor(Intercept,logdbh) 0.13 0.42 -0.69 0.85 1.01 865
#> Tail_ESS
#> sd(Intercept) 1867
#> sd(logdbh) 1909
#> cor(Intercept,logdbh) 1516
#>
#> Regression Coefficients:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept 0.50 0.06 0.38 0.62 1.00 3835 2984
#> logdbh 0.60 0.03 0.54 0.66 1.00 1949 1867
#> wd 0.03 0.07 -0.11 0.18 1.00 3489 2388
#> logdbh:wd -0.05 0.04 -0.12 0.03 1.00 2148 2121
#>
#> Further Distributional Parameters:
#> Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sigma 0.09 0.00 0.08 0.10 1.00 4956 2552
#>
#> Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
9.2 lme4
grp_fit_lme <- lme4::lmer(
log(h) ~ log(dbh) * wd + (1 + log(dbh) | sp),
data = allo_df)
10 References
Betancourt, M. (2017). Diagnosing Biased Inference with Divergences.
Gelman, A., et al. (2013). Bayesian Data Analysis, Third Edition. Chapman & Hall/CRC.
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.5.1 (2025-06-13)
#> os Ubuntu 24.04.3 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Etc/UTC
#> date 2025-10-31
#> pandoc 3.8.1 @ /usr/bin/ (via rmarkdown)
#> quarto 1.7.32 @ /usr/local/bin/quarto
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> ! package * version date (UTC) lib source
#> P abind 1.4-8 2024-09-12 [?] RSPM (R 4.5.0)
#> P backports 1.5.0 2024-05-23 [?] RSPM (R 4.5.0)
#> P bayesplot * 1.14.0 2025-08-31 [?] RSPM (R 4.5.0)
#> P boot 1.3-32 2025-08-29 [?] RSPM (R 4.5.0)
#> P bridgesampling 1.1-2 2021-04-16 [?] RSPM (R 4.5.0)
#> P brms * 2.23.0 2025-09-09 [?] RSPM (R 4.5.0)
#> P Brobdingnag 1.2-9 2022-10-19 [?] RSPM (R 4.5.0)
#> P cachem 1.1.0 2024-05-16 [?] RSPM (R 4.5.0)
#> P checkmate 2.3.3 2025-08-18 [?] RSPM (R 4.5.0)
#> P cli 3.6.5 2025-04-23 [?] RSPM (R 4.5.0)
#> P cmdstanr * 0.9.0 2025-03-30 [?] repository (https://github.com/stan-dev/cmdstanr@da99e2b)
#> P coda 0.19-4.1 2024-01-31 [?] RSPM (R 4.5.0)
#> codetools 0.2-20 2024-03-31 [3] CRAN (R 4.5.1)
#> P data.table 1.17.8 2025-07-10 [?] RSPM (R 4.5.0)
#> devtools 2.4.6 2025-10-03 [2] RSPM (R 4.5.0)
#> P digest 0.6.37 2024-08-19 [?] RSPM (R 4.5.0)
#> P distributional 0.5.0 2024-09-17 [?] RSPM (R 4.5.0)
#> P dplyr * 1.1.4 2023-11-17 [?] RSPM (R 4.5.0)
#> ellipsis 0.3.2 2021-04-29 [2] RSPM (R 4.5.0)
#> P evaluate 1.0.5 2025-08-27 [?] RSPM (R 4.5.0)
#> P farver 2.1.2 2024-05-13 [?] RSPM (R 4.5.0)
#> P fastmap 1.2.0 2024-05-15 [?] RSPM (R 4.5.0)
#> P forcats * 1.0.1 2025-09-25 [?] RSPM (R 4.5.0)
#> P fs 1.6.6 2025-04-12 [?] RSPM (R 4.5.0)
#> P generics 0.1.4 2025-05-09 [?] RSPM (R 4.5.0)
#> P ggplot2 * 4.0.0 2025-09-11 [?] RSPM (R 4.5.0)
#> P glue 1.8.0 2024-09-30 [?] RSPM (R 4.5.0)
#> P gridExtra 2.3 2017-09-09 [?] RSPM (R 4.5.0)
#> P gtable 0.3.6 2024-10-25 [?] RSPM (R 4.5.0)
#> P hms 1.1.4 2025-10-17 [?] RSPM (R 4.5.0)
#> P htmltools 0.5.8.1 2024-04-04 [?] RSPM (R 4.5.0)
#> htmlwidgets 1.6.4 2023-12-06 [2] RSPM (R 4.5.0)
#> P inline 0.3.21 2025-01-09 [?] RSPM (R 4.5.0)
#> P isoband 0.2.7 2022-12-20 [?] RSPM (R 4.5.0)
#> P jsonlite 2.0.0 2025-03-27 [?] RSPM (R 4.5.0)
#> P kableExtra 1.4.0 2024-01-24 [?] RSPM (R 4.5.0)
#> P knitr * 1.50 2025-03-16 [?] RSPM (R 4.5.0)
#> P labeling 0.4.3 2023-08-29 [?] RSPM (R 4.5.0)
#> lattice 0.22-7 2025-04-02 [3] CRAN (R 4.5.1)
#> P lifecycle 1.0.4 2023-11-07 [?] RSPM (R 4.5.0)
#> P lme4 1.1-37 2025-03-26 [?] RSPM (R 4.5.0)
#> P loo 2.8.0 2024-07-03 [?] RSPM (R 4.5.0)
#> P lubridate * 1.9.4 2024-12-08 [?] RSPM (R 4.5.0)
#> P magrittr 2.0.4 2025-09-12 [?] RSPM (R 4.5.0)
#> MASS 7.3-65 2025-02-28 [3] CRAN (R 4.5.1)
#> P Matrix 1.7-4 2025-08-28 [?] RSPM (R 4.5.0)
#> P matrixStats 1.5.0 2025-01-07 [?] RSPM (R 4.5.0)
#> P memoise 2.0.1 2021-11-26 [?] RSPM (R 4.5.0)
#> mgcv 1.9-3 2025-04-04 [3] CRAN (R 4.5.1)
#> P minqa 1.2.8 2024-08-17 [?] RSPM (R 4.5.0)
#> P mvtnorm 1.3-3 2025-01-10 [?] RSPM (R 4.5.0)
#> nlme 3.1-168 2025-03-31 [3] CRAN (R 4.5.1)
#> P nloptr 2.2.1 2025-03-17 [?] RSPM (R 4.5.0)
#> P patchwork * 1.3.2 2025-08-25 [?] RSPM (R 4.5.0)
#> P pillar 1.11.1 2025-09-17 [?] RSPM (R 4.5.0)
#> P pkgbuild 1.4.8 2025-05-26 [?] RSPM (R 4.5.0)
#> P pkgconfig 2.0.3 2019-09-22 [?] RSPM (R 4.5.0)
#> pkgload 1.4.1 2025-09-23 [2] RSPM (R 4.5.0)
#> P plyr 1.8.9 2023-10-02 [?] RSPM (R 4.5.0)
#> P posterior 1.6.1 2025-02-27 [?] RSPM (R 4.5.0)
#> P processx 3.8.6 2025-02-21 [?] RSPM (R 4.5.0)
#> P ps 1.9.1 2025-04-12 [?] RSPM (R 4.5.0)
#> P purrr * 1.1.0 2025-07-10 [?] RSPM (R 4.5.0)
#> P QuickJSR 1.8.1 2025-09-20 [?] RSPM (R 4.5.0)
#> P R6 2.6.1 2025-02-15 [?] RSPM (R 4.5.0)
#> P rbibutils 2.3 2024-10-04 [?] RSPM (R 4.5.0)
#> P RColorBrewer 1.1-3 2022-04-03 [?] RSPM (R 4.5.0)
#> P Rcpp * 1.1.0 2025-07-02 [?] RSPM (R 4.5.0)
#> P RcppParallel 5.1.11-1 2025-08-27 [?] RSPM (R 4.5.0)
#> P Rdpack 2.6.4 2025-04-09 [?] RSPM (R 4.5.0)
#> P readr * 2.1.5 2024-01-10 [?] RSPM (R 4.5.0)
#> P reformulas 0.4.1 2025-04-30 [?] RSPM (R 4.5.0)
#> remotes 2.5.0 2024-03-17 [2] RSPM (R 4.5.0)
#> renv 1.1.5 2025-07-24 [1] RSPM (R 4.5.0)
#> P reshape2 1.4.4 2020-04-09 [?] RSPM (R 4.5.0)
#> P rlang 1.1.6 2025-04-11 [?] RSPM (R 4.5.0)
#> P rmarkdown 2.30 2025-09-28 [?] RSPM (R 4.5.0)
#> P rstan 2.32.7 2025-03-10 [?] RSPM (R 4.5.0)
#> P rstantools 2.5.0 2025-09-01 [?] RSPM (R 4.5.0)
#> P rstudioapi 0.17.1 2024-10-22 [?] RSPM (R 4.5.0)
#> P S7 0.2.0 2024-11-07 [?] RSPM (R 4.5.0)
#> P scales 1.4.0 2025-04-24 [?] RSPM (R 4.5.0)
#> sessioninfo 1.2.3 2025-02-05 [2] RSPM (R 4.5.0)
#> P StanHeaders 2.32.10 2024-07-15 [?] RSPM (R 4.5.0)
#> P stringi 1.8.7 2025-03-27 [?] RSPM (R 4.5.0)
#> P stringr * 1.5.2 2025-09-08 [?] RSPM (R 4.5.0)
#> P svglite 2.2.1 2025-05-12 [?] RSPM (R 4.5.0)
#> P systemfonts 1.3.1 2025-10-01 [?] RSPM (R 4.5.0)
#> P tensorA 0.36.2.1 2023-12-13 [?] RSPM (R 4.5.0)
#> P textshaping 1.0.4 2025-10-10 [?] RSPM (R 4.5.0)
#> P tibble * 3.3.0 2025-06-08 [?] RSPM (R 4.5.0)
#> P tictoc * 1.2.1 2024-03-18 [?] RSPM (R 4.5.0)
#> P tidyr * 1.3.1 2024-01-24 [?] RSPM (R 4.5.0)
#> P tidyselect 1.2.1 2024-03-11 [?] RSPM (R 4.5.0)
#> P tidyverse * 2.0.0 2023-02-22 [?] RSPM (R 4.5.0)
#> P timechange 0.3.0 2024-01-18 [?] RSPM (R 4.5.0)
#> P tzdb 0.5.0 2025-03-15 [?] RSPM (R 4.5.0)
#> usethis 3.2.1 2025-09-06 [2] RSPM (R 4.5.0)
#> P utf8 1.2.6 2025-06-08 [?] RSPM (R 4.5.0)
#> P vctrs 0.6.5 2023-12-01 [?] RSPM (R 4.5.0)
#> P viridisLite 0.4.2 2023-05-02 [?] RSPM (R 4.5.0)
#> P withr 3.0.2 2024-10-28 [?] RSPM (R 4.5.0)
#> P xfun 0.53 2025-08-19 [?] RSPM (R 4.5.0)
#> P xml2 1.4.1 2025-10-27 [?] RSPM (R 4.5.0)
#> P yaml 2.3.10 2024-07-26 [?] RSPM (R 4.5.0)
#>
#> [1] /workspaces/afec-bayes-2025/renv/library/linux-ubuntu-noble/R-4.5/x86_64-pc-linux-gnu
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/local/lib/R/library
#>
#> * ── Packages attached to the search path.
#> P ── Loaded and on-disk path mismatch.
#>
#> ──────────────────────────────────────────────────────────────────────────────